Common variants of pro-inflammatory gene IL1B and interactions with PPP1R13L and POLR1G in relation to lung cancer among Northeast Chinese

Lung cancer is a complex disease influenced by a variety of genetic and environmental factors. The cytokine interleukin 1 beta, encoded by IL1B, is an important mediator of the inflammatory response and is involved in a variety of cellular activities. The effect of single nucleotide polymorphisms (SNPs) in IL1B has been investigated in relation to cancer, with inconsistent results. This Northeastern Chinese case–control study of 627 cases and 633 controls evaluated the role of three haplotype-tagging single nucleotide polymorphisms (htSNPs) (rs1143633, rs3136558 and rs1143630) representing 95% of the common haplotype diversity across the IL1B gene, and assessed interactions among IL1B, PPP1R13L, POLR1G and smoking duration in relation to lung cancer risk. Analyses of five genetic models showed associations with lung cancer risk for rs1143633 in the dominant model [adjusted-OR (95% CI) = 0.67 (0.52–0.85), P = 0.0012] and rs3136558 in the recessive model [adjusted-OR (95% CI) = 1.44 (1.05–1.98), P = 0.025]. Haplotype4 was associated with increased lung cancer risk [adjusted-OR (95% CI) = 1.55 (1.07–2.24), P = 0.021]. The variant G-allele of rs1143633 was protective in the subgroup with smoking duration > 20 years. Using multifactor dimensionality reduction (MDR) analyses, we identified the three best candidate interaction models, with smoking duration or IL1B rs1143633 as the main effect. In conclusion, our findings suggest that IL1B SNP rs1143633 may be associated with lower risk of lung cancer, confirming a previously identified marker; IL1B SNP rs3136558 and haplotype4 consisting of IL1B htSNPs may be associated with increased risk of lung cancer; and interactions of IL1B with POLR1G, PPP1R13L or smoking duration, whether independent or combined, may be involved in the risk of lung cancer and lung squamous cell carcinoma.

Lung cancer is an important and prevalent cause of cancer-related death worldwide and constitutes a serious public health problem. Lung cancer is a complex disease influenced by a variety of genetic and environmental factors. Susceptibility genes and single nucleotide polymorphisms (SNPs) have been linked to lung cancer risk. Tobacco remains the leading risk factor for lung cancer 1 . Chronic lung diseases that entail chronic inflammation have been suspected to play a role in the pathogenesis of lung cancer 1 . Another possible mechanism may involve gene-gene or gene-environment interactions in relation to lung cancer 2 .

Selected IL1B htSNPs and lung cancer risk. Genotype distributions and lung cancer risk for the three IL1B htSNPs were analyzed in co-dominant, dominant, recessive, over-dominant and log-additive models after adjustment for smoking duration (Table 3). For the whole study group, rs1143633 in the dominant model [adjusted odds ratio (OR) with 95% confidence interval (CI): adjusted-OR (95% CI) = 0.67 (0.52-0.85), P = 0.0012] (and also in the co-dominant and log-additive models) and rs3136558 in the recessive model [adjusted-OR (95% CI) = 1.44 (1.05-1.98), P = 0.025] (and also in the co-dominant model) were associated with lung cancer risk. No model showed a significant association for rs1143630. In the subgroups stratified by smoking duration, the variant G-allele of rs1143633 showed a protective effect in the > 20 years subgroup (Table 4). No significant associations were found for the two other htSNPs (data not shown). In the subgroups stratified by histopathology, rs1143633 in the log-additive model [adjusted-OR (95% CI) = 0.59 (0.47-0.75), P < 0.0001] and rs3136558 in the log-additive model [adjusted-OR (95% CI) = 1.35 (1.08-1.69), P = 0.0086] were associated with disease risk in the subgroup of lung squamous cell carcinoma (Table 3). No significant association was found for the subgroups of lung adenocarcinoma and other histopathology (data not shown).
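The genetic models named above differ only in how a genotype is coded before the odds ratio is estimated. The following minimal Python sketch illustrates that coding and a crude (unadjusted) odds ratio with a Woolf-type 95% CI from a 2 × 2 table. It is an illustration only: the function names `encode` and `odds_ratio` and the example counts are hypothetical, and the ORs reported in this study come from logistic regression adjusted for smoking duration, which this sketch does not reproduce. The co-dominant model, which treats the three genotypes as separate categories, is omitted.

```python
import math

def encode(genotype, variant, model):
    """Code a genotype string such as "AG" under a given genetic model."""
    n = genotype.count(variant)  # number of variant alleles: 0, 1 or 2
    if model == "dominant":       # carriers of at least one variant allele
        return int(n >= 1)
    if model == "recessive":      # variant homozygotes only
        return int(n == 2)
    if model == "over-dominant":  # heterozygotes vs both homozygotes
        return int(n == 1)
    if model == "log-additive":   # per-allele (trend) coding
        return n
    raise ValueError(f"unknown model: {model}")

def odds_ratio(a, b, c, d, z=1.96):
    """Crude OR with a Woolf-type CI.

    a, b: exposed cases, exposed controls
    c, d: unexposed cases, unexposed controls
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper

# Hypothetical counts, not data from this study.
print(encode("AG", "G", "dominant"))  # heterozygote counts as a carrier
print(odds_ratio(20, 40, 30, 30))
```

For rs1143633 (A>G), for example, the dominant coding groups AG and GG carriers together and compares them with AA homozygotes.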

Analysis of linkage disequilibrium (LD) and haplotype. Linkage disequilibrium was examined. The analyses showed that LD was moderate between rs1143633 and rs3136558 (D′ value = 0.5814) and between rs3136558 and rs1143630 (D′ value = 0.462), and very low between rs1143633 and rs1143630 (D′ value = 0.0701) in the present population (Table 5). Haplotype association analysis of IL1B htSNPs with lung cancer risk showed that haplotype4 (rs1143633 A -rs3136558 C -rs1143630 A ) was associated with increased risk of lung cancer after adjustment for smoking duration [adjusted-OR (95% CI) = 1.55 (1.07-2.24), P = 0.021] and that haplotype2 was marginally associated with increased risk of lung cancer (Table 6).

Table 7 summarizes the best candidate models of interactions of selected attributes related to lung cancer risk using the MDR approach. Three best models were identified for the whole study group. In the combined interaction analysis of IL1B, PPP1R13L, POLR1G and smoking duration, both the two-factor model and the three-factor model were statistically significant, but the three-factor model (IL1B rs3136558, POLR1G rs967591 and smoking duration) had a relatively higher overall balanced accuracy of 0.6062 and a cross-validation consistency of 8/10, and was significant at P-value 0.0100-0.0110. In the conjoined interaction analysis of IL1B and smoking duration, the two-factor model, the three-factor model and the four-factor model were all statistically significant, but the four-factor model (IL1B rs1143633, rs3136558 and rs1143630 and smoking duration) had a relatively higher overall balanced accuracy of 0.6112 and a cross-validation consistency of 10/10, and was significant at P-value 0.0040-0.0050.
In the joint interaction analysis of IL1B, PPP1R13L and POLR1G, both the two-factor model and the three-factor model were statistically significant, but the three-factor model (IL1B rs1143633, PPP1R13L rs1970764, POLR1G rs735482) had a relatively higher overall balanced accuracy of 0.5973 and a cross-validation consistency of 10/10, and was significant at P-value 0.0130-0.0140. Smoking duration was the main effect in the models consisting of IL1B htSNPs, PPP1R13L and POLR1G SNPs and smoking duration, or of IL1B htSNPs and smoking duration. IL1B rs1143633 was the main effect in the model consisting of IL1B htSNPs. For the histopathology subgroups, only the conjoined interaction analysis of IL1B and smoking duration was performed. In the subgroup of squamous cell carcinoma, both the two-factor model (IL1B rs1143633 and smoking duration) and the four-factor model (IL1B rs1143633, rs3136558 and rs1143630 and smoking duration) had relatively higher overall balanced accuracy, with a cross-validation consistency of 10/10, and were statistically significant, and smoking duration showed an obvious main effect. In the subgroup of other histopathology, the two-factor model was statistically significant. In the subgroup of lung adenocarcinoma, no interaction was identified. Figure 1 shows the interaction entropy model from the interaction analysis of IL1B htSNPs, PPP1R13L and POLR1G SNPs and smoking duration, built using the MDR software. The entropy-based model indicated that some pairs among the 7 attributes showed medium-level interaction or independence, whereas no high degree of synergistic interaction among the 7 attributes was apparent in the current analysis.

A Caucasian-American community-based case-control study (The Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening) reported that the common rs16944 and rs1143634 SNPs of IL1B did not seem to play a role in prostate cancer risk 11 .
A Chinese case-control study reported that IL1B SNP rs1143634 was not associated with gastric cancer risk 12 . An Asian Chinese case-control study of inflammation-related genes involved in wound healing reported that the studied IL1B SNPs were not associated with oesophageal squamous cell carcinoma 13 .

Main findings, implications and strengths of study.
In this Northeastern Chinese case-control study, we examined three htSNPs tagging 95% of the haplotype diversity of IL1B, a gene known to be involved in the inflammatory response and associated with cancer risk in previous studies. To the best of our knowledge, this is the first study to evaluate three htSNPs tagging 95% of the haplotype diversity of IL1B and to assess specific interactions between IL1B htSNPs, PPP1R13L and POLR1G risk SNPs and smoking duration in relation to lung cancer risk.
From a classical case-control approach in the whole study group, our main finding is that the variant G-allele of IL1B SNP rs1143633 (A>G) was associated with lower risk of lung cancer under the dominant model and that the variant C-allele of IL1B SNP rs3136558 (T>C) was associated with increased risk of lung cancer. When stratified by smoking duration, IL1B SNP rs1143633 was specifically associated with lung cancer risk among long-term smokers (> 20 years). When stratified by histopathology, the studied IL1B htSNPs were associated with risk only among patients with lung squamous cell carcinoma and not among patients with lung adenocarcinoma. These results again suggest that the pathogenesis of the two subtypes may differ in genetic factors and gene changes 21 . The haplotype analysis of the three IL1B htSNPs revealed a positive association with lung cancer risk for haplotype4, which encompasses the variant alleles of rs3136558 and rs1143630. Haplotype2, which encompasses the wild-type allele of rs1143633 and the variant allele of rs3136558, was also marginally associated with increased lung cancer risk. The other haplotypes encompassing the variant allele of rs1143630 were not associated with lung cancer risk. This suggests that haplotypes 2 and 4 are in linkage with the functional IL1B polymorphism. The present result for IL1B SNP rs1143633 replicates the finding from a hepatocellular carcinoma study in a Korean population 9 .

Figure 1. Interaction entropy model. This graphical model describes the percent entropy that is explained by each selected attribute or pair-wise combination in our study population. Positive percent entropy indicates information gain (IG) or synergy and negative percent entropy indicates lack of information gain (IG) or redundancy. Schematic coloration used in the visualization tools represents a continuum from synergy (i.e. non-additive) to redundancy. Red represents a high degree of synergy interaction and orange a lesser degree (neither color is apparent in the current analysis); brown represents medium-level interaction or independence; and green and blue represent redundancy between the markers. This image was created by MDR software (3.0.3.dev.jar) (https://sourceforge.net/projects/mdr/) 36 .

We did not correct for multiple testing. In this study, we included 3 SNPs, and thus, it could be argued that the threshold for statistical significance should be 0.05/3 = 0.0167. However, the current study is hypothesis driven, and the SNPs were selected to be in linkage disequilibrium with 95% of the genetic variation in IL1B.
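The threshold arithmetic above can be checked with a few lines of Python. This is a minimal sketch of a Bonferroni-style comparison, using only the two single-SNP P values reported in the Results for the whole study group; the variable names are illustrative, and the comparison is not an analysis performed in this study.

```python
# Bonferroni-style threshold for 3 tested SNPs (0.05 / 3).
alpha = 0.05
n_tests = 3
threshold = alpha / n_tests

# Single-SNP P values reported in the Results for the whole study group.
p_values = {
    "rs1143633 (dominant model)": 0.0012,
    "rs3136558 (recessive model)": 0.025,
}

# Flag which P values fall below the corrected threshold.
below = {name: p < threshold for name, p in p_values.items()}

print(f"threshold = {threshold:.4f}")
for name, passed in below.items():
    print(name, "below threshold" if passed else "not below threshold")
```

Under this stricter threshold, the P value for rs1143633 would remain below the cut-off, whereas that for rs3136558 would not.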
The integration of genetic variants in risk prediction models beyond the traditional epidemiological covariates has been considered the way forward in lung cancer risk prediction modeling 22 . Using the MDR approach in the whole study group, smoking duration or IL1B rs1143633 was observed as the single main effect in the respective one-factor models. The overall balanced accuracy and cross-validation consistency rose as the number of factors increased. This pattern indicates the presence of interaction, meaning that the effect of smoking duration or rs1143633 depends on the level of one or several other factors. Its existence shows that the effects of the factors studied simultaneously are not independent of each other. Specific interactions between smoking duration, IL1B rs3136558 and POLR1G rs967591; between smoking duration and IL1B rs1143633, rs3136558 and rs1143630; and between IL1B rs1143633, PPP1R13L rs1970764 and POLR1G rs735482 were observed in relation to lung cancer risk. Medium-level interactions were found between most markers. In the histopathology subgroups, smoking duration was the main effect and positive interactions were seen only in the subgroup of lung squamous cell carcinoma. These results add new evidence to our previous study 17 . The results again suggest that lung squamous cell carcinoma has a stronger relation with tobacco smoking than lung adenocarcinoma 17 . Overall, the MDR results show that smoking duration as the main effect, and the interactions between IL1B htSNPs, PPP1R13L and POLR1G SNPs and smoking duration, play critical roles in the occurrence of lung cancer and lung squamous cell carcinoma.
There is evidence of causal relationships between chronic infection, inflammation, and cancer 23 . An inflammatory microenvironment is an essential component of the tumor microenvironment (TME) 20 . The lung presents a unique milieu in which tumors progress in collusion with the TME. Inflammation plays an important role in the pathogenesis of lung cancer, and pulmonary disorders in lung cancer patients, such as chronic obstructive pulmonary disease (COPD) and emphysema, constitute co-morbid conditions and are independent risk factors for lung cancer 24 . Chronic inflammation is a key feature of COPD and could be a potential driver of lung cancer development 25 . The chronic inflammatory microenvironment is associated with the release of various pro-inflammatory and oncogenic mediators, including the cytokine IL1B 20 . Excessive and uncontrolled release of pro-inflammatory cytokines such as IL1B has been reported in patients with severe coronavirus disease 2019 (COVID-19) 26 . Environmental and occupational toxicants may induce pulmonary inflammation 27–29 . Chronic inflammation has been linked to several human diseases and also to the initiation and promotion of cancer. High expression of the promoter of IL1B SNP rs1143627 (−31T>C) was induced in human lung epithelial NCI-H2009 cells (a human lung adenocarcinoma cell line) treated with cigarette-smoke condensate 30 . Release of inflammasome products such as IL1B, and cytokine storms, are hallmarks of COVID-19 infection, and smoking may critically exacerbate COVID-19-related inflammation 31 . The present interaction study adds evidence that IL1B, which is related to inflammation and immunity, is involved in lung cancer risk, either independently or in combination with other factors such as smoking.

Potential functional roles of selected IL1B htSNPs. SNP Function Prediction (FuncPred) 32 on the SNPinfo Web Server was used to analyze the three htSNPs for nsSNP (non-synonymous coding SNP) status, splicing regulation, stop codon, PolyPhen prediction, SNPs3D prediction, TFBS (transcription factor-binding site) prediction, miRNA binding site prediction, regulatory potential score, and conservation score. The analysis indicated that IL1B rs1143630 has a significant conservation score = 0.004. However, we observed that rs1143633 was the most important htSNP in this study, and we identified haplotype4 as a candidate to be in linkage with the functional SNP. Several functional SNPs have been identified in IL1B 14 .
Limitations. Our study has several limitations. Power analyses for the current study showed that for rs1143633, we had a 91% chance of detecting OR = 0.67 at the 0.05 significance level using two-sided tests under the dominant model, showing that the sample size is reasonable for this SNP. We had an 80% or 81% chance for rs3136558 or rs1143630, respectively, of detecting OR = 1.4 at the 0.05 significance level using two-sided tests under the dominant model, indicating that further, larger population-based studies are warranted to confirm the present findings. In addition, the haplotype analysis suggests that the studied SNPs are in linkage with the functional polymorphism. Thus, functional studies of the polymorphisms under study would reveal whether the polymorphisms are functional or whether the observed associations are due to linkage with the functional polymorphism. Finally, although matching cases and controls on age, sex and ethnicity improves the efficiency of controlling confounding factors, matching cannot directly control other confounding factors such as smoking duration.

Conclusions
Our findings suggest that IL1B SNP rs1143633 may be associated with lower risk of lung cancer, confirming a previously identified marker; IL1B SNP rs3136558 and haplotype4 consisting of IL1B htSNPs (rs1143633 A -rs3136558 C -rs1143630 A ) may be associated with increased risk of lung cancer; and interactions of IL1B with POLR1G, PPP1R13L or smoking duration, whether independent or combined, may be involved in the risk of lung cancer and lung squamous cell carcinoma. These findings should be validated in larger prospective cohorts. These markers could be used as clinical biomarkers in lung cancer.

Materials and methods
Study population. A total of 1260 individuals were enrolled in this hospital-based case-control study, including 627 cases and 633 controls, as previously reported 18 . The patients with lung cancer were diagnosed based on standard clinical and histological criteria. Eligible cases were previously untreated (no chemotherapy or radiotherapy for cancer prior to recruitment). Cancer-free controls (matched on same sex, age ± 3 years and same ethnicity) were selected from the orthopedics wards in the same area. Demographic and covariate data were obtained from medical records and from questionnaires administered in personal interviews by professional physicians. All participants were unrelated ethnic Han Chinese from Northeast China. Stratification criteria were defined as follows: age (10-year intervals), sex, family history of cancer, smoking duration (20-year intervals) and histopathology (3 subgroups).

Table 2 displays the information on the three IL1B htSNPs and the three risk SNPs in the Chr19q13.3 sub-region. The three risk SNPs of Chr19q13.3 were previously reported 16,18,33 . The data on the three risk SNPs in PPP1R13L and POLR1G were used for analyses of gene-gene and gene-gene-environment interactions in the current study.

DNA isolation and genotyping. A volume of 5 mL of whole blood with ethylenediamine tetraacetic acid (EDTA) anticoagulation was taken from each volunteer. Genomic DNA was extracted from whole blood samples with the Puregene DNA Isolation Kit or FlexiGene DNA kit 250 (Gentra Systems, Minneapolis, MN, USA or Qiagen, Germany) following the manufacturer's instructions. Genotyping of rs1143633 (A>G), rs3136558 (T>C) and rs1143630 (C>A) of the IL1B gene was performed with the ligase detection reaction coupled with polymerase chain reaction (LDR-PCR) genotyping assay as previously published 34 at Shanghai Generay Biotechnology Co. Ltd. (P. R. China). Genotypes of PPP1R13L rs1970764 (A>G) and POLR1G rs967591 (G>A) and rs735482 (A>C) have been previously reported 16 .
Primer Premier 5.0 software was used to design primers. The sequences (5′-3′) of the primers and probes for the three IL1B htSNPs are displayed in Table 8. Each group of LDR probes consisted of 1 common probe and 2 discriminating probes for the 2 alleles. For the PCR reactions, the DNA concentration was 50-100 ng/μL and DNA purity was OD260/OD280 = 1.8-2.0. In summary, the genotyping procedure comprised PCR reactions, LDR reactions and sequencing of the LDR products. The genotyping call-rate was 96.35% for the three IL1B htSNPs. As quality control, pure water was used as negative control and 20% of samples, including cases and controls, were genotyped twice, yielding 100% identical results.

LD, unconditional logistic regression for measurement of OR (95% CI) after adjusting for smoking duration, the Shapiro-Wilk test and the Mann-Whitney U test were performed using SPSS© v16.0 (SPSS Inc, Chicago, IL, USA) or the SNPStats program 35 . Akaike's Information Criterion (AIC) is a standard measure of the goodness of fit of a statistical model. The AIC criterion was applied by giving priority to the model with the lowest AIC value 35 . Haplotypes with frequency < 0.01 among both cases and controls were excluded from the analysis. The interaction analyses of gene-gene and gene-gene-smoking duration in relation to lung cancer risk were conducted using the MDR platform. This software (3.0.3.dev.jar) 36 is an updated version in which permutation testing has been added to the main MDR program. The MDR method is nonparametric and model-free. MDR has reasonable power for identifying interactions between two or more loci in relatively small samples, and excellent power for identifying high-order gene-gene interactions. MDR can be directly applied to case-control and discordant-sib-pair studies 36 . MDR selects and evaluates models by cross-validation and permutation testing.
Balanced accuracy cross-validation training and balanced accuracy cross-validation testing indicate the accuracy in the training set and the testing set, respectively; both range from 0 to 1, and larger values indicate higher accuracy. Cross-validation consistency indicates the consistency rate of cross-validation. The number of permutations was set to 1000 according to the instructions 36 . A P value less than 0.05 was considered statistically significant. The possible functionality of the three IL1B htSNPs was assessed using the web tool SNPinfo 31 .
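The two model-evaluation quantities described above can be illustrated with a short Python sketch. It assumes the usual definition of balanced accuracy as the mean of sensitivity and specificity, and treats cross-validation consistency as the number of folds in which the same model is selected as best. The function names and the example counts are hypothetical; this is not the MDR software's own code.

```python
from collections import Counter

def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity; ranges from 0 to 1."""
    sensitivity = tp / (tp + fn)  # cases correctly classified as high risk
    specificity = tn / (tn + fp)  # controls correctly classified as low risk
    return (sensitivity + specificity) / 2

def cv_consistency(best_model_per_fold):
    """Return the most often selected model, its count and the number of folds."""
    model, count = Counter(best_model_per_fold).most_common(1)[0]
    return model, count, len(best_model_per_fold)

# Hypothetical example: 10 cross-validation folds, same model chosen in 8.
folds = ["model A"] * 8 + ["model B"] * 2
model, count, total = cv_consistency(folds)
print(f"{model}: {count}/{total}")
print(balanced_accuracy(tp=60, fn=40, tn=70, fp=30))
```

A cross-validation consistency of 8/10 therefore means that the same combination of factors was selected as the best model in 8 of the 10 folds.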

Data availability
The information on the selected htSNPs across the IL1B gene was obtained from the dbSNP database: https://www.ncbi.nlm.nih.gov/snp/?term=rs1143633, rs3136558 and rs1143630. All data generated during this study are included in this published article. The data that support the findings of this study are available from the corresponding author upon reasonable request.